AITopics | finite markov decision process

Safe Exploration in Finite Markov Decision Processes with Gaussian Processes

Neural Information Processing Systems, Nov-21-2025, 15:06:37 GMT

In classical reinforcement learning, agents accept arbitrary short-term loss for long-term gain when exploring their environment. This is infeasible for safety-critical applications such as robotics, where even a single unsafe action may cause system failure or harm the environment. In this paper, we address the problem of safely exploring finite Markov decision processes (MDPs). We define safety in terms of an a priori unknown safety constraint that depends on states and actions and satisfies certain regularity conditions expressed via a Gaussian process prior. We develop a novel algorithm, SAFEMDP, for this task and prove that it completely explores the safely reachable part of the MDP without violating the safety constraint. To achieve this, it cautiously explores safe states and actions in order to gain statistical confidence about the safety of unvisited state-action pairs from noisy observations collected while navigating the environment. Moreover, the algorithm explicitly considers reachability when exploring the MDP, ensuring that it does not get stuck in any state with no safe way out. We demonstrate our method on digital terrain models for the task of exploring an unknown map with a rover.
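The idea summarized in the abstract can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' SAFEMDP implementation: the states form a 1-D chain with moves to either neighbor (so any state reached through safe states can also be left safely), safety depends on the state only, the kernel is a squared-exponential one, and the confidence scaling of 2.0 is arbitrary. The function names `rbf_kernel`, `gp_posterior`, and `safe_reachable` are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=2.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of state positions."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_posterior(x_obs, y_obs, x_all, noise_std=0.05):
    """Zero-mean GP posterior mean and standard deviation at every state."""
    K = rbf_kernel(x_obs, x_obs) + noise_std ** 2 * np.eye(len(x_obs))
    K_s = rbf_kernel(x_all, x_obs)
    mean = K_s @ np.linalg.solve(K, y_obs)
    prior_var = rbf_kernel(x_all, x_all).diagonal()
    explained = np.einsum("ij,ji->i", K_s, np.linalg.solve(K, K_s.T))
    std = np.sqrt(np.maximum(prior_var - explained, 0.0))
    return mean, std


def safe_reachable(lower, threshold, start):
    """States reachable from `start` on a chain while staying in safe states.

    A state counts as safe when its lower confidence bound clears the
    threshold. Moves are assumed bidirectional, so reachable implies returnable.
    """
    safe = lower >= threshold
    reachable = np.zeros_like(safe)
    if not safe[start]:
        return reachable
    reachable[start] = True
    i = start - 1
    while i >= 0 and safe[i]:  # walk left through contiguous safe states
        reachable[i] = True
        i -= 1
    i = start + 1
    while i < len(safe) and safe[i]:  # walk right through contiguous safe states
        reachable[i] = True
        i += 1
    return reachable


# Assumed toy data: safety measurements at the first three states of a chain.
states = np.arange(10.0)
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([1.0, 0.7, 0.4])

mean, std = gp_posterior(x_obs, y_obs, states)
lower = mean - 2.0 * std  # pessimistic estimate of the safety value
explorable = safe_reachable(lower, threshold=0.0, start=0)
print(np.flatnonzero(explorable))
```

Classifying states by the lower confidence bound rather than the posterior mean is what makes the sketch cautious: far from the data the posterior standard deviation returns to the prior, so unvisited states stay outside the explorable set until nearby measurements shrink the uncertainty.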

finite markov decision process, name change, safe exploration, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

dbef234be68d8b170240511639610fd1-Supplemental-Conference.pdf

Neural Information Processing Systems, Aug-19-2025, 10:20:47 GMT

artificial intelligence, machine learning, similarity, (14 more...)

Country:

Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.06)
Africa > Benin (0.06)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Reviews: Safe Exploration in Finite Markov Decision Processes with Gaussian Processes

Neural Information Processing Systems, Jan-20-2025, 16:25:20 GMT

The paper is well-written and clear. The proposed idea is interesting. I have the following comments/questions: 1) Does the Lipschitz assumption hold here with some probability, or is it assumed to always hold? 2) Figure 1: should it be \bar{s}_2 instead of s_2 in the caption? The use of bar for non-sets is confusing. I do not see the need for the last intersection in Equation 4. 4) When you repeatedly apply Equation 4, the number of states that satisfy the safety constraint shrinks because you use the Lipschitz bound in the worst-case sense.
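The worst-case use of a Lipschitz bound that comment 4 refers to can be illustrated with a small sketch. This is not the paper's Equation 4, which is not reproduced on this page; it only shows the generic argument that a state can be certified safe when some already-safe state's lower bound, reduced by the Lipschitz constant times the distance, still clears the threshold. The function name `lipschitz_certified`, the 1-D state positions, and all numbers are assumptions made for illustration.

```python
import numpy as np


def lipschitz_certified(lower, safe, positions, lipschitz_const, threshold):
    """Certify states as safe using a worst-case Lipschitz bound.

    State j is certified when some known-safe state s satisfies
    lower[s] - lipschitz_const * |positions[s] - positions[j]| >= threshold.
    """
    certified = safe.copy()
    if not safe.any():
        return certified
    for j in range(len(positions)):
        worst_case = lower[safe] - lipschitz_const * np.abs(positions[safe] - positions[j])
        if worst_case.max() >= threshold:
            certified[j] = True
    return certified


# Assumed toy setup: only state 0 is known to be safe, with lower bound 1.0.
positions = np.arange(6.0)
lower = np.full(6, -np.inf)
lower[0] = 1.0
safe = np.array([True, False, False, False, False, False])

loose = lipschitz_certified(lower, safe, positions, lipschitz_const=0.3, threshold=0.0)
tight = lipschitz_certified(lower, safe, positions, lipschitz_const=0.6, threshold=0.0)
print(loose.tolist(), tight.tolist())
```

Because the bound subtracts the largest possible change over the distance, a larger Lipschitz constant certifies fewer states from the same evidence, which is the conservatism the reviewer points at.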

finite markov decision process, gaussian process, safe exploration, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)

Safe Exploration in Finite Markov Decision Processes with Gaussian Processes

Turchetta, Matteo, Berkenkamp, Felix, Krause, Andreas

Neural Information Processing Systems, Feb-14-2020, 15:43:58 GMT

In classical reinforcement learning, agents accept arbitrary short-term loss for long-term gain when exploring their environment. This is infeasible for safety-critical applications such as robotics, where even a single unsafe action may cause system failure or harm the environment. In this paper, we address the problem of safely exploring finite Markov decision processes (MDPs). We define safety in terms of an a priori unknown safety constraint that depends on states and actions and satisfies certain regularity conditions expressed via a Gaussian process prior. We develop a novel algorithm, SAFEMDP, for this task and prove that it completely explores the safely reachable part of the MDP without violating the safety constraint.

finite markov decision process, gaussian process, safe exploration, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)
